We present theory and experiment for the task of discriminating two nonorthogonal states, given
multiple copies. We implement several local measurement schemes, on both pure states and states 
mixed by depolarizing noise. We find that schemes which are optimal (or have optimal scaling) 
without noise perform worse with noise than simply repeating the optimal single-copy measurement. 
Applying optimal control theory, we derive the globally-optimal local measurement strategy, which 
outperforms all other local schemes, and experimentally implement it for various levels of noise. 

Quantum control, the application of control theory to quantum systems, offers powerful tools
to enable quantum technologies to function robustly in the presence of noise and device
imperfections [1] [2] [3] [4] [5], and to simplify protocols by reducing the need for
entangling operations or collective measurements [6] [7]. One such
tool is adaptive measurement, wherein one adapts fu- 
ture measurements based on the outcomes of previous 
ones pQ. Quantum control based on adaptive measure- 
ments has been used to improve the measurement of an 
optical phase [4] [8] [9]. Here, we consider the problem of
quantum state discrimination, and demonstrate experi- 
mentally that adaptive local measurements can discrimi- 
nate pure states better than nonadaptive ones. Moreover, 
we show that in the presence of noise, which is unavoid- 
able in practice, the full power of optimal control theory 
is required to derive the globally-optimal adaptive (lo- 
cal) measurement scheme, which we then experimentally 
implement. 

The task of state discrimination is a fundamental prim- 
itive in many fields of quantum information science, 
including quantum communications, cryptography, and 
computing. If a quantum system is prepared in one of 
several possible states, this preparation can only be de- 
termined with certainty if the possible states are all mu- 
tually orthogonal. For nonorthogonal states, two com- 
plementary tasks are often considered pQ: minimizing 
the likelihood of either an incorrect result (an error) [10],
or of an inconclusive result with no errors [11] [12] [13].

In this Letter, we consider the minimum-error discrim- 
ination of two nonorthogonal qubit states, given N iden- 
tical copies of the state, using only local measurements, 
where the cost function $C_N$ (which is to be minimized)
is the probability of error. While continuous measure- 
ment schemes for distinguishing two infinite-dimensional 
pure states from a single copy have been studied else- 
where [SI H3], here we consider discrete measurements of 
each of N discrete copies of the state. An optimal so- 
lution for multiple-copy discrimination of pure states is 
given by Helstrom [10] (see also pQ), and takes the form of 
a two-outcome projective measurement on the joint space of all copies. For N > 1, this
measurement is a nonlocal (collective) measurement on all copies, and schemes
in which the same local measurement is performed on 
each system do not achieve this optimal performance [15].
Remarkably, it has been predicted theoretically that the 
optimum can be reached using adaptive local measurements [15]. In this adaptive scheme
each system is measured locally in the basis that minimizes the probability
of error immediately after that measurement. We refer 
to this procedure of N adaptive measurements as the 
"locally-optimal local measurement" scheme. As shown 
in [15], for pure states this adaptive measurement per- 
forms just as well as the optimal collective measurement 
on all N copies of the state. In the asymptotic limit $N \to \infty$, the scaling of $C_N$
for various state discrimination schemes has been well studied [13 EH E], with
the notable finding that adaptive local measurements do 
not provide an advantage (in terms of scaling) over fixed 
strategies, even for mixed states [TT] . 

Although the asymptotic performance of state dis- 
crimination schemes is of considerable academic inter- 
est, practical applications will require results for finite 
N, and moreover must consider the effect of noise (i.e. 
mixed states). Here, we adjust the local measurement strategies presented in Ref. [15] to
function in the presence of noise, and analyze their performance theoretically and
experimentally. Importantly, we discover that,
with the exception of states that are almost pure, sim- 
ple nonadaptive "unbiased measurements" (see below) 
outperform the locally optimal strategy defined above, 
for a sufficiently large number of copies. However, the 
globally-optimal local measurement strategy, determined 
using optimal control theory, does outperform unbiased 
measurements, even though it does not achieve the optimum achievable using nonlocal
measurements. For N up to 10, we theoretically predict and experimentally demonstrate the
performance of each scheme with various levels of noise.

All measurements we consider are projective, in a basis $\{|\phi\rangle, |\phi - \pi/2\rangle\}$,
where $\phi \in [0, \pi/2)$ and $|\phi\rangle = \cos\phi\,|x\rangle + \sin\phi\,|y\rangle$, for some
orthonormal basis $\{|x\rangle, |y\rangle\}$. Initially, we restrict our study to the problem of
distinguishing between two nonorthogonal pure states, defined without loss of generality by
$|\psi_\pm\rangle = \cos\theta\,|x\rangle \pm \sin\theta\,|y\rangle$. Their overlap is
$c = \langle\psi_+|\psi_-\rangle = \cos 2\theta$, and they are prepared with probability $q_\pm$
($q_+ \geq q_-$). The single-copy Helstrom measurement is the projective measurement with
$\phi^{\rm Hel}(q_+) = \frac{1}{2}\,\mathrm{arccot}\left((q_+ - q_-)\cot 2\theta\right)$. From a
measurement on a single copy, the most likely state given outcome $+$ ($-$) is $|\psi_+\rangle$
($|\psi_-\rangle$), and the probability of error resulting from this best guess is
$C_1^{\rm Hel} = \left(1 - \sqrt{1 - 4 q_+ q_- c^2}\right)/2$.
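The single-copy Helstrom angle and error can be checked numerically. The following is a minimal sketch, not the authors' code: the function names are hypothetical, and it assumes the guess rule "outcome $|\phi\rangle$ means $|\psi_+\rangle$, outcome $|\phi - \pi/2\rangle$ means $|\psi_-\rangle$". It uses `atan2` in place of arccot so that $2\phi$ lands in $(0, \pi)$.

```python
import math

def helstrom_angle(q_plus, theta):
    """Angle phi with cot(2*phi) = (q_plus - q_minus) * cot(2*theta)."""
    q_minus = 1.0 - q_plus
    # atan2(y, x) with y = sin(2*theta) > 0 returns 2*phi in (0, pi), matching arccot.
    return 0.5 * math.atan2(math.sin(2 * theta), (q_plus - q_minus) * math.cos(2 * theta))

def error_probability(phi, q_plus, theta):
    """Error when guessing psi_+ on outcome |phi> and psi_- on outcome |phi - pi/2>."""
    wrong_given_plus = math.sin(phi - theta) ** 2   # psi_+ prepared, outcome |phi - pi/2>
    wrong_given_minus = math.cos(phi + theta) ** 2  # psi_- prepared, outcome |phi>
    return q_plus * wrong_given_plus + (1.0 - q_plus) * wrong_given_minus

def helstrom_bound(q_plus, theta):
    """Closed-form single-copy minimum error for overlap c = cos(2*theta)."""
    c = math.cos(2 * theta)
    return 0.5 * (1.0 - math.sqrt(1.0 - 4.0 * q_plus * (1.0 - q_plus) * c * c))

theta = math.radians(15)
for q_plus in (0.5, 0.7, 0.9):
    phi = helstrom_angle(q_plus, theta)
    print(q_plus, math.degrees(phi), error_probability(phi, q_plus, theta),
          helstrom_bound(q_plus, theta))
```

For equal priors the angle is $\pi/4$ and, at $\theta = 15^\circ$, the error is 0.25; for the other priors the measured-basis error coincides with the closed form.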

For multiple copies, we first build upon the three local measurement schemes presented in
Ref. [15]. We treat these schemes as a prescription for what measurements to make, but
unlike [15] we employ Bayesian processing of all results. This analysis allows us to
determine the performance of these schemes for distinguishing mixed states. For pure
states, however, such analysis is equivalent to the protocols as presented in Ref. [15].

1. Unbiased measurements: Independently perform the single-copy Helstrom measurement on
each copy, and decide in favor of the state with the highest posterior probability. For
pure states with $q_+ = q_-$, this decision reduces to choosing the state with the most
favorable outcomes, a "majority vote" as in Ref. [15]. When N is even there is the potential
of a "split vote", in which case a random guess is made. For general states and odd N, the
error probability of this scheme is
$C_N^{\rm un} = \sum_{m > N/2} \binom{N}{m} (C_1^{\rm Hel})^m (1 - C_1^{\rm Hel})^{N-m}$,
and $C_N^{\rm un} = C_{N-1}^{\rm un}$ for even N. For pure states, the large-N scaling is
$C_N^{\rm un} \sim \eta c^N$, where $\eta$ is a constant.
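The majority-vote expression, and the statement that an even N performs no better than N - 1, can be illustrated with a short sketch. The function names are hypothetical, and the sketch assumes equal priors so that the Bayesian decision reduces to a vote; `c1` stands for the single-copy error $C_1^{\rm Hel}$.

```python
import math

def majority_vote_error(c1, n_copies):
    """Error of a majority vote; an even number of copies falls back to N - 1."""
    n = n_copies if n_copies % 2 == 1 else n_copies - 1
    # m counts copies whose single-copy guess is wrong; a wrong majority is an error.
    return sum(math.comb(n, m) * c1**m * (1 - c1)**(n - m)
               for m in range(n // 2 + 1, n + 1))

def split_vote_error(c1, n_copies):
    """Direct calculation for even N: a tie is resolved by a fair coin."""
    half = n_copies // 2
    wrong_majority = sum(math.comb(n_copies, m) * c1**m * (1 - c1)**(n_copies - m)
                         for m in range(half + 1, n_copies + 1))
    tie = math.comb(n_copies, half) * c1**half * (1 - c1)**half
    return wrong_majority + 0.5 * tie

c1 = 0.25  # single-copy Helstrom error for theta = 15 degrees, equal priors
for n in range(1, 7):
    print(n, majority_vote_error(c1, n))
```

Comparing `split_vote_error(c1, 4)` with `majority_vote_error(c1, 3)` shows the two agree, which is the content of $C_N^{\rm un} = C_{N-1}^{\rm un}$ for even N.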

2. Fully biased measurements: Independently perform a projective measurement on each copy
with $\phi = \theta$; that is, in the basis $\{|\theta\rangle, |\theta - \pi/2\rangle\}$, where
$|\theta\rangle = |\psi_+\rangle$. For pure states, the scheme can only guess the state
$|\psi_+\rangle$ if all measurement results are $|\psi_+\rangle$; otherwise it must guess
$|\psi_-\rangle$. This is the "unanimity vote" scheme of Ref. [15]. Mixed states, however,
cannot reliably fulfill unanimity; in general, the best guess must be made via Bayesian
analysis. For pure states, the error probability is $C_N = q_+ c^{2N}$. Asymptotically,
this scheme thus scales quadratically better than unbiased measurements; however, when N is
sufficiently small, unbiased measurements have better performance.
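To see where the quadratically better scaling starts to pay off, the two pure-state expressions quoted above can be compared copy by copy. This is an illustrative sketch with hypothetical function names; it simply evaluates $q_+ c^{2N}$ against the majority-vote formula and is not a simulation of the experiment.

```python
import math

def unbiased_error(q_plus, theta, n_copies):
    """Majority-vote error built from the single-copy Helstrom error (pure states)."""
    c = math.cos(2 * theta)
    c1 = 0.5 * (1 - math.sqrt(1 - 4 * q_plus * (1 - q_plus) * c * c))
    n = n_copies if n_copies % 2 == 1 else n_copies - 1  # even N performs as N - 1
    return sum(math.comb(n, m) * c1**m * (1 - c1)**(n - m)
               for m in range(n // 2 + 1, n + 1))

def fully_biased_error(q_plus, theta, n_copies):
    """Pure-state unanimity-vote error q_+ c^(2N) as quoted in the text."""
    return q_plus * math.cos(2 * theta) ** (2 * n_copies)

def first_crossover(theta, q_plus, max_copies=20):
    """Smallest N at which the fully biased scheme beats unbiased measurements."""
    for n in range(1, max_copies + 1):
        if fully_biased_error(q_plus, theta, n) < unbiased_error(q_plus, theta, n):
            return n
    return None

print(first_crossover(math.radians(15), 0.5))
```

With the experimental parameters $\theta = 15^\circ$ and $q_+ = q_- = 1/2$, these formulas give a crossover at N = 6.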

3. Locally-optimal local measurements: Perform an optimal single-copy Helstrom measurement
$\phi^{\rm Hel}(q_+)$ on the first copy. Via Bayes' theorem, use the result to update the
prior probability $P_1 = q_+$ to the posterior probability $P_2$. Using this, apply a new
single-copy Helstrom measurement $\phi^{\rm Hel}(P_2)$ on the next copy. Repeat this adaptive
process with updated probabilities $P_n$ for all remaining copies. The best guess is the
state with the higher final posterior probability. For pure states, this scheme is globally
optimal for all N [15], yielding the same probability of error as the collective N-copy
Helstrom measurement:
$C_N^{\rm loc} = \left(1 - \sqrt{1 - 4 q_+ q_- c^{2N}}\right)/2 \sim q_+ q_- c^{2N}$.
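The equality between the adaptive scheme and the collective Helstrom error can be checked exactly for small N by enumerating all outcome sequences. The sketch below is an assumption-laden illustration rather than the authors' implementation: function names are hypothetical, only pure states are treated, and the recursion visits all $2^N$ branches.

```python
import math

def helstrom_angle(prior_plus, theta):
    """Single-copy Helstrom angle for the current prior of psi_+."""
    return 0.5 * math.atan2(math.sin(2 * theta), (2 * prior_plus - 1) * math.cos(2 * theta))

def locally_optimal_error(prior_plus, theta, copies_left):
    """Expected final error when every remaining copy gets a Helstrom measurement."""
    if copies_left == 0:
        return min(prior_plus, 1 - prior_plus)  # error of the best final guess
    phi = helstrom_angle(prior_plus, theta)
    p_plus = math.cos(phi - theta) ** 2   # Pr[outcome |phi> | psi_+]
    p_minus = math.cos(phi + theta) ** 2  # Pr[outcome |phi> | psi_-]
    total = 0.0
    for like_plus, like_minus in ((p_plus, p_minus), (1 - p_plus, 1 - p_minus)):
        p_outcome = like_plus * prior_plus + like_minus * (1 - prior_plus)
        if p_outcome > 0:
            posterior = like_plus * prior_plus / p_outcome  # Bayes update
            total += p_outcome * locally_optimal_error(posterior, theta, copies_left - 1)
    return total

def collective_helstrom_error(q_plus, theta, n_copies):
    """Closed-form N-copy Helstrom error for pure states."""
    c = math.cos(2 * theta)
    return 0.5 * (1 - math.sqrt(1 - 4 * q_plus * (1 - q_plus) * c ** (2 * n_copies)))

theta = math.radians(15)
for n in range(1, 7):
    print(n, locally_optimal_error(0.5, theta, n), collective_helstrom_error(0.5, theta, n))
```

The two printed columns agree to numerical precision, consistent with the pure-state result of Ref. [15].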

We experimentally demonstrate these schemes with $\theta = 15^\circ$ and $q_+ = q_- = 1/2$,
using single photon polarization to encode the two pure states we wish to discriminate; see
Fig. 1. Within the experiment, horizontal photon polarization implements the $|x\rangle$ and
vertical polarization implements the $|y\rangle$ basis states. A half-wave plate (HWP)
determines the measurement basis. The measurement outcomes are entirely dependent on the
relative angle between the state and the measurement axes, and not on any global orientation
of the state or measurement axes. Therefore, we do not separately prepare the two states
$|\psi_+\rangle$ and $|\psi_-\rangle$, but rather always prepare $|\psi_+\rangle$ and offset
the measurement axes by an angle $2\theta$ for experiments on $|\psi_-\rangle$. A
high-contrast-ratio polarizing beam displacer and single photon counting modules implement
the orthogonal measurement outcomes. The polarization contrast ratio achievable with the
apparatus was measured to be better than 0.9999 (the Bayesian processing assumes perfect
visibility). The results of running each of the three algorithms in the experiment, and
their theoretical predictions, are shown in Fig. 2; the experimental results correspond well
with the theoretical predictions.

FIG. 1: Layout of the experiment. A polarizing beam splitter (PBS) acts as a filter to
ensure high fidelity horizontally-polarized photons. A half-wave plate (HWP) in a motorized
rotation stage determines the measurement basis. A polarizing beam displacer (PBD) and
single-photon counting modules (SPCMs) discriminate between horizontally and vertically
polarized photons with high contrast. The result of the measurement is fed to a processor
which, depending on the protocol being tested, adjusts the operation of the HWP controller.
Single photon inputs are obtained through type-I spontaneous parametric downconversion: a
410 nm diode laser pumps a BiBO (bismuth borate) crystal, producing pairs of 820 nm single
photons in the state $|HH\rangle$, with photons in separate spatial modes. One photon of
each pair is guided to the input of the experiment through a single-mode optical fibre. The
other photon is guided directly to a single-photon counting module. Detection in coincidence
ensures high fidelity single photons are measured in the experiment.

FIG. 2: Probability of error $C_N$ in N-copy state discrimination using various schemes in
the absence of noise. Lines represent theoretical predictions; points represent experimental
data, each 2000 measurements. Error bars represent one standard deviation of the mean of a
binomial distribution. The locally-optimal local measurement scheme performs best for all N;
where N = 1 it is equivalent to unbiased measurements, both schemes using one single-copy
optimal measurement.

We now turn to the performance of these schemes in the presence of noise, i.e., for mixed
states. This situation describes the addition of noise due to, for example, transmission
over a noisy channel, as well as imperfect measurements. In particular, we consider uniform
depolarizing noise on qubits |T5] of strength $0 \leq \nu \leq 1$, so that the two states
are now

$\rho_\pm = \frac{1}{2}\left[\mathbb{1} + (1 - \nu)(Z\cos 2\theta \pm X\sin 2\theta)\right]$.   (1)



Here $X = |x\rangle\langle y| + |y\rangle\langle x|$ and
$Z = |x\rangle\langle x| - |y\rangle\langle y|$ are Pauli operators. This noise is simulated
experimentally by performing bit, phase, and bit-phase flips in the measurement basis, each
with probability $\nu/4$. Because the
noise is depolarizing, the angles for the fixed measure- 
ment schemes are the same in the mixed state case as in 
the pure case. Even so, noise will clearly have a detrimen- 
tal effect on the performance of the schemes described 
above. Indeed, it is now the case that no local scheme 
can achieve the globally optimal performance achievable 
with a collective measurement. 
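Returning to the noise model of Eq. (1), a short worked step shows why equal-probability bit, phase, and bit-phase flips reproduce it. This is a sketch under one added assumption: $Y$ denotes the third Pauli operator, which the text does not define explicitly. For any single-qubit state $\rho$, the standard identity $\rho + X\rho X + Y\rho Y + Z\rho Z = 2\,\mathbb{1}$ gives

```latex
\begin{align}
\mathcal{E}_\nu(\rho)
  &= \left(1 - \tfrac{3\nu}{4}\right)\rho
     + \tfrac{\nu}{4}\left(X\rho X + Y\rho Y + Z\rho Z\right) \\
  &= \left(1 - \tfrac{3\nu}{4}\right)\rho
     + \tfrac{\nu}{4}\left(2\,\mathbb{1} - \rho\right) \\
  &= (1 - \nu)\,\rho + \nu\,\tfrac{\mathbb{1}}{2}.
\end{align}
```

Applying this map to the pure state $\frac{1}{2}\left[\mathbb{1} + Z\cos 2\theta \pm X\sin 2\theta\right]$ shrinks the Bloch vector by the factor $(1 - \nu)$, which is the form of Eq. (1).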

We have calculated the respective error probabilities $C_N$ exactly as a function of noise;
see Figs. 3 and 4. Both the fully biased measurement scheme and the locally-optimal adaptive
scheme lose their superiority over unbiased measurements as $\nu$ is increased. Our
theoretical analysis confirms this behavior for general $\theta$ and $q_+$, with the value
of $\nu$ at which the error probability curves cross depending on $\theta$, $q_+$, and N.

The locally optimal scheme maximizes the discriminat- 
ing power of each measurement individually, but that is 
not the same as maximizing the discriminating power of 
all N measurements together (even when restricting to 
local, not collective, measurements). Because the locally- 
optimal local measurement scheme is evidently not the 
globally-optimal local measurement scheme in general, 
we now turn to finding such a scheme. 

4. Globally-optimal local measurements: To determine the optimal discrimination scheme using
local adaptive measurements, we use dynamic programming [19]. This will in general yield an
adaptive scheme that depends explicitly on the total number of measurements N that will be
performed, unlike the locally optimal scheme. The scheme is defined by a table of
measurement angles, with rows corresponding to n, the copy to be measured
($1 \leq n \leq N$), and columns corresponding to $P_n$, the probability prior to the nth
measurement that the prepared state is $|\psi_+\rangle$, conditioned on the measurement
results of the first $n - 1$ copies ($P_1 = q_+$). Thus, at the nth step, we consult the
table to obtain the measurement angle $\phi_n(P_n)$ to be used. The result of this
measurement is then used to calculate a posterior $P_{n+1}$ via Bayes' theorem, and we
proceed to the next step. Linear interpolation resolves the discreteness in the table's
representation of $P_n$ (here we use 2501 samples).

We construct this table as follows. In all cases, the optimal measurement on the final copy
$n = N$ must be the single-copy Helstrom measurement, $\phi_N(P_N) = \phi^{\rm Hel}(P_N)$,
as this measurement will minimize the error probability $C_N$ regardless of the previous
measurement choices. Starting from this fact, the globally-optimal local measurement scheme
for N copies is constructed in reverse. Using the recursive relationship between the
expected error probabilities after $n$ and $n + 1$ measurements, the penultimate measurement
$\phi_{N-1}(P_{N-1})$ that minimizes $C_N$ can be found by a numerical search, given
$P_{N-1}$. When calculated for samples of the range $0 \leq P_{N-1} \leq 1$, this defines
row $N - 1$ of the measurement table.

The optimal measurement that precedes the final two measurements can similarly be obtained
by minimizing the expected error probability over the measurement $\phi_{N-2}(P_{N-2})$ for
some $P_{N-2}$. This constructs row $N - 2$ of the measurement table. Continuing this
analysis, we construct a table of N measurement settings, defining $\phi_n^{\rm glo}(P_n)$,
which results in the lowest final error probability $C_N^{\rm glo}$.
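The backward construction of the table can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the numerical search is replaced by a brute-force scan over a uniform grid of angles, the prior grid is much coarser than the 2501 samples quoted above, and the outcome likelihoods assume depolarized states with Bloch vectors shortened by $(1 - \nu)$.

```python
import math

def likelihoods(phi, theta, nu):
    """Pr[outcome |phi>] for psi_+ and psi_- under depolarizing strength nu."""
    plus = 0.5 * (1 + (1 - nu) * math.cos(2 * (phi - theta)))
    minus = 0.5 * (1 + (1 - nu) * math.cos(2 * (phi + theta)))
    return plus, minus

def interpolate(values, p):
    """Linear interpolation of a table sampled uniformly on [0, 1]."""
    x = min(max(p, 0.0), 1.0) * (len(values) - 1)
    i = min(int(x), len(values) - 2)
    return values[i] + (x - i) * (values[i + 1] - values[i])

def expected_error(p, phi, theta, nu, next_values):
    """Expected final error after measuring at phi, then following next_values."""
    l_plus, l_minus = likelihoods(phi, theta, nu)
    total = 0.0
    for a, b in ((l_plus, l_minus), (1 - l_plus, 1 - l_minus)):
        pr = a * p + b * (1 - p)
        if pr > 0:
            total += pr * interpolate(next_values, a * p / pr)  # Bayes posterior
    return total

def globally_optimal_table(theta, nu, n_copies, n_priors=201, n_angles=181):
    priors = [k / (n_priors - 1) for k in range(n_priors)]
    angles = [0.5 * math.pi * k / (n_angles - 1) for k in range(n_angles)]
    values = [min(p, 1 - p) for p in priors]  # error of the best guess after all copies
    table = []
    for _ in range(n_copies):                 # rows N, N-1, ..., 1 (built in reverse)
        row, new_values = [], []
        for p in priors:
            best = min(angles, key=lambda phi: expected_error(p, phi, theta, nu, values))
            row.append(best)
            new_values.append(expected_error(p, best, theta, nu, values))
        table.insert(0, row)
        values = new_values
    return priors, table, values              # values: expected error versus P_1

theta = math.radians(15)
_, table, errors = globally_optimal_table(theta, 0.1, 3)
print("error for q_+ = 1/2:", interpolate(errors, 0.5))
```

In use, row `table[n - 1]` would be interpolated at the current prior $P_n$ to pick the angle for copy n. With `nu = 0` the result should approach the pure-state collective value, up to grid error.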

To determine the performance of non-globally-optimal measurements when noise is present one
can use the same procedure, but with a nonoptimal measurement choice. For example, the
locally-optimal local measurement $\phi_n^{\rm loc}(P_n)$ defines $C_N^{\rm loc}$. As the
measurements are Markovian, sampling is unnecessary, and for moderate N as used in this
paper, the probability of error can be calculated exactly.

The globally-optimal local measurement scheme constructed according to the above procedure
reduces to the locally optimal scheme in the noiseless case. For high noise, we have found
numerically that the measurement angle for all but the final few copies approaches $\pi/4$,
as for unbiased measurements when $q_+ = q_-$. Its performance also approaches that of
unbiased measurements, in this regime where $\nu$ is not small. Importantly, for all
$\nu > 0$, we have $C_N^{\rm glo} < \min(C_N^{\rm loc}, C_N^{\rm un})$ for $N > 3$, as
expected. But we also have $C_N^{\rm glo} > C_N^{\rm col}$, the probability of error from a
collective measurement over all copies, achieved by the N-copy Helstrom measurement
[1] [10]. This is illustrated in Figs. 3 and 4.

FIG. 3: Error probability $C_N$ of discrimination schemes under $\nu = 10\%$ depolarizing
noise. Points represent 1000 experimental discriminations. The addition of noise
detrimentally impacts the locally-optimal local measurement scheme more than the unbiased
scheme. Indeed, theory predicts that the latter outperforms the former for N = 5, N = 7, and
N > 9. The globally-optimal local measurement scheme performs better than all other local
measurement schemes in the presence of noise. The theoretical optimal collective measurement
cost is plotted for comparison.

FIG. 4: Error probability $C_N$ of discrimination schemes under various levels of noise
$\nu$ for N = 10 measured copies. Points each represent 2000 ($\nu = 0$) or 1000
($\nu > 0$) experimental discriminations. Here, unbiased measurements outperform the
locally-optimal local measurement scheme for noise $\nu > 10\%$.

We experimentally investigate all four local measurement schemes in the presence of 2%, 10%,
30%, and 60% noise. The results for $\nu = 10\%$ noise for N up to 10 are shown in Fig. 3,
and for fixed N = 10 under various noise levels in Fig. 4. Further results may be found in
the supplementary material below. Theoretical curves are determined numerically using the
dynamic programming method described above. The discontinuities in the gradient of the
curves arise due to the discreteness of the number of outcomes required to guess a given
state. In all cases, experimental data agree with theoretical predictions, within expected
statistical variation. The globally optimal scheme has the best performance for all levels
of noise and for all N.

We have shown that local adaptive N-copy discrimination schemes which are optimal in the
noiseless regime
are significantly impacted by the addition of noise. The 
locally-optimal local measurement scheme, in particular, 
performs more poorly than nonadaptive unbiased mea- 
surements. Subsequently, by a dynamic programming 
analysis, we have demonstrated the adaptive local mea- 
surement scheme that is globally optimal, having, in all 
cases, the lowest probability of an incorrect discrimina- 
tion of any local measurement scheme for any N. In 
addition to illuminating part of the fundamentally in- 
teresting problem of quantum state discrimination, our 
work provides an insight into the fragility of idealized 
models practically applied, and demonstrates the useful- 
ness of optimal quantum control techniques in mitigating 
the real-world issues that face the application of quantum 
technologies. 

Here we present additional details and results of N-copy discrimination schemes for two
nonorthogonal quantum states under depolarizing noise.



NOISE SIMULATION 

Single-qubit depolarizing noise is simulated in the experiment by the random application of
bit, phase, and bit-phase flips in the measurement basis, each with probability $\nu/4$,
where $0 \leq \nu \leq 1$ quantifies the amount of noise. For each copy, the appropriate
measurement, described by the angle $\phi$, is first calculated according to the scheme
being tested. Before passing to the half-wave plate controller, this angle is passed through
a depolarizing filter subroutine, isolated from the main discrimination routines. The
subroutine will perform the operation $\phi \to \phi$ with probability $1 - \frac{3}{4}\nu$,
or $\phi \to \pi/2 - \phi$, $\phi \to -\phi$, or $\phi \to \pi/2 + \phi$, each with
probability $\nu/4$, implementing identity, bit, phase, and bit-phase flip operations
respectively. This realizes a noisy measurement equivalent to a depolarizing channel of
strength $\nu$.
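The angle-randomizing filter described above can be sketched as follows, together with an exact check that its average outcome statistics match those of a depolarized state. The function names are hypothetical, and the check assumes that the outcome label stays attached to the first basis vector $|\phi'\rangle$ of the remapped basis; the actual subroutine used in the experiment is not given in the text.

```python
import math
import random

def noisy_angle(phi, nu, rng=random):
    """Remap the measurement angle: each flip with probability nu/4, else identity."""
    r = rng.random()
    if r < nu / 4:
        return math.pi / 2 - phi   # bit flip
    if r < nu / 2:
        return -phi                # phase flip
    if r < 3 * nu / 4:
        return math.pi / 2 + phi   # bit-phase flip
    return phi                     # identity, probability 1 - 3*nu/4

def averaged_outcome_probability(phi, theta, nu):
    """Pr[first outcome] for pure psi_+, averaged exactly over the four angle maps."""
    def pure(angle):
        return math.cos(angle - theta) ** 2
    flips = (math.pi / 2 - phi, -phi, math.pi / 2 + phi)
    return (1 - 0.75 * nu) * pure(phi) + 0.25 * nu * sum(pure(a) for a in flips)

def depolarized_outcome_probability(phi, theta, nu):
    """Pr[first outcome] for the depolarized state with Bloch vector scaled by (1 - nu)."""
    return 0.5 * (1 + (1 - nu) * math.cos(2 * (phi - theta)))

theta = math.radians(15)
for nu in (0.02, 0.1, 0.3, 0.6):
    print(nu, averaged_outcome_probability(0.4, theta, nu),
          depolarized_outcome_probability(0.4, theta, nu))
```

The two columns coincide for every noise level, since the bit-flip and phase-flip contributions cancel and the bit-phase flip inverts the signal.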



DERIVATION OF GLOBALLY-OPTIMAL LOCAL 
MEASUREMENTS 

The table of measurements that defines the globally-optimal local measurement scheme for N
copies is constructed as follows. Let $P_{n+1}$ be the probability (i.e. the observer's
credence) that the prepared state is $|\psi_+\rangle$, conditioned on the n measurement
results from the first n copies. Let $R_n^{\rm glo}$ be the expected value (calculated after
the nth measurement) of the final probability of error after measuring the remaining
$N - n$ copies using globally-optimal local measurements. Let $\phi_n$ be a parameter
defining the measurement basis for the nth measurement, and $D_n$ be its outcome. We begin
with the condition that, after all copies have been measured,
$R_N^{\rm glo}(P_{N+1}) = \min(P_{N+1}, 1 - P_{N+1})$, and proceed iteratively in reverse.
Given $R_n^{\rm glo}(P_{n+1})$ for some $n > 0$, and measurement angle $\phi_n$, it is
evident that at the previous step the final error probability $R_{n-1}(P_n, \phi_n)$ after
measuring the nth copy with angle $\phi_n$ and the remaining $N - n$ copies using
globally-optimal local measurements is

$R_{n-1}(P_n, \phi_n) = \sum_{D_n} \Pr[D_n | P_n, \phi_n]\, R_n^{\rm glo}\big(P_{n+1}(D_n, P_n, \phi_n)\big)$,   (2)

where we use Bayes' theorem to evaluate

$P_{n+1}(D_n, P_n, \phi_n) = \dfrac{\Pr[D_n | +, \phi_n]\, P_n}{\Pr[D_n | P_n, \phi_n]}$.   (3)

Here $\Pr[D_n | P_n, \phi_n] = \Pr[D_n | +, \phi_n]\, P_n + \Pr[D_n | -, \phi_n]\,(1 - P_n)$.
The globally optimal measurement at step $n - 1$ is defined by finding the angle
$\phi_n^{\rm glo}(P_n)$ that minimizes $R_{n-1}$, and this defines

$R_{n-1}^{\rm glo}(P_n) = R_{n-1}\big(P_n, \phi_n^{\rm glo}(P_n)\big)$.   (4)

This process is then continued down to $n = 1$. The probability of error for this scheme is
thus $C_N^{\rm glo} = R_0^{\rm glo}(q_+)$ (since $P_1 = q_+$). Once this analysis is
completed, the values stored in $\phi_n^{\rm glo}(P_n)$ define the measurements to be
performed within the experiment.
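As a worked instance of the recursion, consider the first backward step, $n = N$. This is an illustrative derivation, not part of the original text. Substituting the terminal condition into Eq. (2) and using Eq. (3) to clear the denominator gives

```latex
\begin{align}
R_{N-1}(P_N, \phi_N)
  &= \sum_{D_N} \Pr[D_N | P_N, \phi_N]\,
     \min\!\big(P_{N+1},\, 1 - P_{N+1}\big) \\
  &= \sum_{D_N} \min\!\big(\Pr[D_N | +, \phi_N]\, P_N,\;
     \Pr[D_N | -, \phi_N]\,(1 - P_N)\big).
\end{align}
```

The second line is the error probability of a Bayesian best guess after a single measurement with prior $P_N$. Minimizing it over $\phi_N$ is the single-copy minimum-error problem, which is why the last row of the table is the single-copy Helstrom measurement $\phi^{\rm Hel}(P_N)$.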



ADDITIONAL RESULTS 

Following are plots of the error probability of unbi- 
ased measurements, fully biased measurements, locally- 
optimal local measurements, globally-optimal local mea- 
surements, and optimal collective measurements, under 
various levels of depolarizing noise v, for N up to 10. 
Shown are theoretical calculations (lines) and experimen- 
tal data for 1000 discriminations (points, local schemes 
only), with error bars plus or minus one standard devia- 
tion of the mean. In all cases, the globally-optimal local 
measurement scheme, constructed using optimal control 
theory and dynamic programming as detailed above, has 
the best performance of any local measurement scheme 
for any N . 
FIG. 1: Error probability $C_N$ of discrimination schemes under $\nu = 2\%$ depolarizing
noise.

FIG. 2: Error probability $C_N$ of discrimination schemes under $\nu = 30\%$ depolarizing
noise.

FIG. 3: Error probability $C_N$ of discrimination schemes under $\nu = 60\%$ depolarizing
noise.



